2 research outputs found

    An Improved Multiple Features and Machine Learning-Based Approach for Detecting Clickbait News on Social Networks

    Get PDF
    The widespread usage of social media has led to the increasing popularity of online advertisements, which have been accompanied by a disturbing spread of clickbait headlines. Clickbait dissatisfies users because the article content does not match their expectation. Detecting clickbait posts in online social networks is an important task to fight this issue. Clickbait posts use phrases that are mainly posted to attract a userā€™s attention in order to click onto a specific fake link/website. That means clickbait headlines utilize misleading titles, which could carry hidden important information from the target website. It is very difficult to recognize these clickbait headlines manually. Therefore, there is a need for an intelligent method to detect clickbait and fake advertisements on social networks. Several machine learning methods have been applied for this detection purpose. However, the obtained performance (accuracy) only reached 87% and still needs to be improved. In addition, most of the existing studies were conducted on English headlines and contents. Few studies focused specifically on detecting clickbait headlines in Arabic. Therefore, this study constructed the first Arabic clickbait headline news dataset and presents an improved multiple feature-based approach for detecting clickbait news on social networks in Arabic language. The proposed approach includes three main phases: data collection, data preparation, and machine learning model training and testing phases. The collected dataset included 54,893 Arabic news items from Twitter (after preprocessing). Among these news items, 23,981 were clickbait news (43.69%) and 30,912 were legitimate news (56.31%). This dataset was pre-processed and then the most important features were selected using the ANOVA F-test. Several machine learning (ML) methods were then applied with hyperparameter tuning methods to ensure finding the optimal settings. Finally, the ML models were evaluated, and the overall performance is reported in this paper. The experimental results show that the Support Vector Machine (SVM) with the top 10% of ANOVA F-test features (user-based features (UFs) and content-based features (CFs)) obtained the best performance and achieved 92.16% of detection accuracy

    Misbehavior-aware on-demand collaborative intrusion detection system using distributed ensemble learning for VANET

    Get PDF
    Vehicular ad hoc networks (VANETs) play an important role as enabling technology for future cooperative intelligent transportation systems (CITSs). Vehicles in VANETs share real-time information about their movement state, traffic situation, and road conditions. However, VANETs are susceptible to the cyberattacks that create life threatening situations and/or cause road congestion. Intrusion detection systems (IDSs) that rely on the cooperation between vehicles to detect intruders, were the most suggested security solutions for VANET. Unfortunately, existing cooperative IDSs (CIDSs) are vulnerable to the legitimate yet compromised collaborators that share misleading and manipulated information and disrupt the IDSsā€™ normal operation. As such, this paper proposes a misbehavior-aware on-demand collaborative intrusion detection system (MA-CIDS) based on the concept of distributed ensemble learning. That is, vehicles individually use the random forest algorithm to train local IDS classifiers and share their locally trained classifiers on-demand with the vehicles in their vicinity, which reduces the communication overhead. Once received, the performance of the classifiers is evaluated using the local testing dataset in the receiving vehicle. The evaluation values are used as a trustworthiness factor and used to rank the received classifiers. The classifiers that deviate much from the box-and-whisker plot lower boundary are excluded from the set of the collaborators. Then, each vehicle constructs an ensemble of weighted random forest-based classifiers that encompasses the locally and remotely trained classifiers. The outputs of the classifiers are aggregated using a robust weighted voting scheme. Extensive simulations were conducted utilizing the network security laboratory-knowledge discovery data mining (NSL-KDD) dataset to evaluate the performance of the proposed MA-CIDS model. The obtained results show that MA-CIDS performs better than the other existing models in terms of effectiveness and efficiency for VANET
    corecore